NSF PAR Search | NSF Public Access Repository

FIC-TSC: Learning Time Series Classification with Fisher Information Constraint

Chen, Xiwen; Zhu, Wenhui; Qiu, Peijie; Wang, Hao; Li, Huayu; Li, Zihan; Wang, Yalin; Sotiras, Aristeidis; Razi, Abolfazl (July 2025, ICML 2025)

Free, publicly-accessible full text available July 13, 2026
Sequence Complementor: Complementing Transformers for Time Series Forecasting with Learnable Sequences

Chen, Xiwen; Qiu, Peijie; Zhu, Wenhui; Li, Huayu; Wang, Hao; Sotiras, Aristeidis; Wang, Yalin; Razi, Abolfazl (February 2025, AAAI 2025)

Free, publicly-accessible full text available February 28, 2026
Learning on Bandwidth Constrained Multi-Source Data with MIMO-inspired DPP MAP Inference

Chen, Xiwen; Li, Huayu; Amin, Rahul; Razi, Abolfazl (July 2024, IEEE Transactions on Machine Learning in Communications and Networking)

Determinantal Point Process (DPP) is a powerful technique to enhance data diversity by promoting the repulsion of similar elements in the selected samples. In particular, DPP-based Maximum A Posteriori (MAP) inference is used to identify subsets with the highest diversity. However, the common assumption that all data samples are available at a single location hinders its applicability to real-world scenarios where data samples are distributed across distinct sources with intermittent and bandwidth-limited connections. This paper proposes a distributed version of DPP inference to enhance multi-source data diversification under limited communication budgets. First, we convert the lower bound of the diversity-maximized distributed sample selection from matrix determinant optimization to a simpler form of the sum of individual terms. Next, a determinant-preserved sparse representation of selected samples is formed by the sink as a surrogate for collected samples and sent back to sources as lightweight messages to eliminate the need for raw data exchange. Our approach is inspired by the channel orthogonalization process of Multiple-Input Multiple-Output (MIMO) systems based on the Channel State Information (CSI). Extensive experiments verify the superiority of our scalable method over the most commonly used data selection methods, including GreeDi, Greedymax, random selection, and stratified sampling, with at least a 12% reduction in Relative Diversity Error (RDE). This enhanced diversity translates to a substantial improvement in the performance of various downstream learning tasks, including multi-level classification (2%-4% gain in accuracy), object detection (2% gain in mAP), and multiple-instance learning (1.3% gain in AUC).
Full Text Available
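The DPP MAP inference mentioned in the abstract above can be illustrated with the standard centralized greedy baseline. This is a minimal sketch, not the paper's distributed method: the function name `greedy_dpp_map`, the toy matrix `X`, and the linear kernel are all assumptions made for illustration. At each step it adds the item that most increases the log-determinant of the selected kernel submatrix.

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Greedily pick up to k indices that maximize log det(L[S, S]).

    L is a symmetric positive semidefinite similarity kernel (n x n).
    This is the naive centralized greedy baseline, written for clarity
    rather than speed.
    """
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_idx, best_val = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # Skip candidates that make the submatrix singular.
            if sign > 0 and logdet > best_val:
                best_idx, best_val = i, logdet
        if best_idx is None:  # no candidate keeps the determinant positive
            break
        selected.append(best_idx)
    return selected

# Hypothetical toy data: rows 0 and 1 are near-duplicates, row 2 is orthogonal.
X = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]])
L = X @ X.T  # linear similarity kernel (an assumption for this sketch)
selected = greedy_dpp_map(L, k=2)
print(selected)
```

On this toy input the selection keeps the orthogonal row and only one of the two near-duplicates, which is the repulsion behavior the abstract describes.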
RD-DPP: Rate-distortion theory meets determinantal point process to diversify learning data samples

Chen, Xiwen; Li, Huayu; Qiu, Peijie; Zhu, Wenhui; Amin, Rahul; Razi, Abolfazl (August 2024, IEEE/CVF Winter Conference on Applications of Computer Vision (WACV))

Selecting representative samples plays an indispensable role in many machine learning and computer vision applications under limited resources (e.g., limited communication bandwidth and computational power). Determinantal Point Process (DPP) is a widely used method for selecting the most diverse representative samples that can summarize a dataset. However, its adaptability to different tasks remains an open challenge, as DPP is difficult to tune for a specific task. In contrast, Rate-Distortion (RD) theory provides a way to measure task-specific diversity. However, optimizing RD for a data selection problem is difficult because the quantity that needs to be optimized is the index set of the selected samples. To tackle these challenges, we first draw an inherent relationship between DPP and RD theory. Our theoretical derivation paves the way for taking advantage of both RD and DPP for task-specific data selection. To this end, we propose a novel method for task-specific data selection for multi-level classification tasks, named RD-DPP. Empirical studies on seven different datasets using five benchmark models demonstrate the effectiveness of the proposed RD-DPP method. Our method also outperforms recent strong competing methods, while exhibiting high generalizability to a variety of learning tasks.
Full Text Available
Enhancing Graph Neural Networks in Large-scale Traffic Incident Analysis with Concurrency Hypothesis

https://doi.org/10.1145/3678717.3691256

Chen, Xiwen; Boroujeni, Sayed_Pedram Haeri; Shu, Xin; Li, Huayu; Razi, Abolfazl (October 2024, ACM)

Full Text Available
InterFormer: Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction

https://doi.org/10.1145/3746252.3761527

Zeng, Zhichen; Liu, Xiaolong; Hang, Mengyue; Liu, Xiaoyi; Zhou, Qinghai; Yang, Chaofei; Liu, Yiqun; Ruan, Yichen; Chen, Laming; Chen, Yuxin; et al (November 2025, ACM)

Free, publicly-accessible full text available November 10, 2026
TimeMIL: Advancing Multivariate Time Series Classification via a Time-aware Multiple Instance Learning

Chen, Xiwen; Qiu, Peijie; Zhu, Wenhui; Li, Huayu; Wang, Hao; Sotiras, Aristeidis; Wang, Yalin; Razi, Abolfazl (July 2024, ICML)

Deep neural networks, including transformers and convolutional neural networks (CNNs), have significantly improved multivariate time series classification (MTSC). However, these methods often rely on supervised learning, which does not fully account for the sparsity and locality of patterns in time series data (e.g., quantification of disease-related anomalous points in ECG and anomaly detection in signals). To address this challenge, we formally discuss and reformulate MTSC as a weakly supervised problem, introducing a novel multiple-instance learning (MIL) framework for better localization of patterns of interest and modeling time dependencies within time series. Our novel approach, TimeMIL, formulates the temporal correlation and ordering within a time-aware MIL pooling, leveraging a tokenized transformer with a specialized learnable wavelet positional token. The proposed method surpassed 26 recent state-of-the-art MTSC methods, underscoring the effectiveness of the weakly supervised TimeMIL in MTSC. The code is available at https://github.com/xiwenc1/TimeMIL.
Full Text Available
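The multiple-instance view in the abstract above treats each time step as an instance and the whole series as a bag with a single label. The sketch below shows generic attention-based MIL pooling over time steps; it is not the TimeMIL architecture (it has no transformer and no wavelet positional token), and the function name `attention_mil_pool`, the parameters `V` and `w`, and the random toy data are assumptions for illustration.

```python
import numpy as np

def attention_mil_pool(H, V, w):
    """Pool per-time-step embeddings into one bag embedding.

    H: (T, d) array, one embedding per time step (instance).
    V: (d, h) projection matrix and w: (h,) scoring vector, both of
       which would be learned in practice; here they are fixed inputs.
    Returns the bag embedding (d,) and the attention weights (T,).
    """
    scores = np.tanh(H @ V) @ w        # one scalar score per time step
    scores = scores - scores.max()     # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum()  # softmax over time steps
    bag = weights @ H                  # weighted average of instances
    return bag, weights

# Hypothetical toy series: 6 time steps, 4-dimensional embeddings.
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))
V = rng.normal(size=(4, 3))
w = rng.normal(size=3)
bag, weights = attention_mil_pool(H, V, w)
print(bag.shape, weights.round(3))
```

The attention weights indicate which time steps contribute most to the bag-level prediction, which is how a weakly supervised model can localize patterns of interest without per-step labels.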
Knowledge distillation under ideal joint classifier assumption

https://doi.org/10.1016/j.neunet.2024.106160

Li, Huayu; Chen, Xiwen; Ditzler, Gregory; Roveda, Janet; Li, Ao (May 2024, Neural Networks)

Full Text Available
DeScoD-ECG: Deep Score-Based Diffusion Model for ECG Baseline Wander and Noise Removal

https://doi.org/10.1109/JBHI.2023.3237712

Li, Huayu; Ditzler, Gregory; Roveda, Janet; Li, Ao (January 2023, IEEE Journal of Biomedical and Health Informatics)

Full Text Available
Targeted Data Poisoning Attacks Against Continual Learning Neural Networks

https://doi.org/10.1109/IJCNN55064.2022.9892774

Li, Huayu; Ditzler, Gregory (January 2022, IEEE/INNS International Joint Conference on Neural Networks)

Full Text Available
